Effective & Efficient Document Ranking without using a Large Lexicon

Author

Yasushi Ogawa

Abstract

Although word-based methods are commonly used in document retrieval, they are not directly applicable to languages that have no obvious word separator. Given a lexicon, it is possible to identify words in documents, but a large lexicon is troublesome to maintain and makes retrieval systems large and complicated. This paper proposes an effective and efficient ranking method that does not use a large lexicon; words need not be identified during document registration because a character-based signature file is used as the access structure. During document retrieval, a user request is statistically analyzed to generate an appropriate query, and the query is evaluated efficiently in a word-based manner using the character-based index. We also propose two optimization techniques to accelerate retrieval.
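
The idea of registering documents without word segmentation can be illustrated with a minimal sketch. This is not the author's implementation: the bigram unit, the 256-bit signature width, the CRC32 hash, and every function name below are assumptions chosen for illustration, and the sketch covers only Boolean candidate filtering, not the ranking, the statistical query analysis, or the two optimizations the abstract mentions.

```python
import zlib

SIG_BITS = 256  # signature width in bits (illustrative assumption)


def char_bigrams(text):
    """Return the set of overlapping character bigrams in text."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


def bigram_bit(bigram):
    """Map a bigram to one bit position with a stable hash (CRC32)."""
    return zlib.crc32(bigram.encode("utf-8")) % SIG_BITS


def signature(text):
    """Superimpose the bits of all bigrams into one integer signature."""
    sig = 0
    for bg in char_bigrams(text):
        sig |= 1 << bigram_bit(bg)
    return sig


def build_index(docs):
    """Registration: no word segmentation, one signature per document."""
    return [signature(d) for d in docs]


def search(term, docs, index):
    """Return ids of documents that contain term.

    The signature test may produce false positives (hash collisions,
    bigrams that occur but are not adjacent), so every candidate is
    verified against the stored text.
    """
    q = signature(term)
    candidates = [i for i, s in enumerate(index) if s & q == q]
    return [i for i in candidates if term in docs[i]]


# Hypothetical sample documents in a language without word separators.
docs = ["東京都の文書検索", "京都の観光案内", "文書の登録と検索"]
index = build_index(docs)
print(search("文書", docs, index))  # [0, 2]
```

Because the index is built from characters rather than words, a query such as "京都" also matches the document containing "東京都". Handling that kind of mismatch is one reason a query would need to be evaluated in a word-based manner on top of the character-based index.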

Similar resources

Automatic Construction of an Opinion-Term Vocabulary for Ad Hoc Retrieval

We present a method to automatically generate a term-opinion lexicon. We also weight these lexicon terms and use them in real time to boost the ranking of documents with opinionated content. We define very simple models both for opinion-term extraction and document ranking. Both the lexicon model and retrieval model are assessed. To evaluate the quality of the lexicon we compare performance with a...

RRLUFF: Ranking function based on Reinforcement Learning using User Feedback and Web Document Features

The principal aim of a search engine is to provide sorted results according to the user's requirements. To achieve this aim, it employs ranking methods to rank web documents based on their significance and relevance to the user query. The novelty of this paper is a user feedback-based ranking algorithm using reinforcement learning. The proposed algorithm is called RRLUFF, in which the rank...

Ranking efficient DMUs using the variation coefficient of weights in DEA

One of the difficulties of Data Envelopment Analysis (DEA) is the problem of deficiency discrimination among efficient Decision Making Units (DMUs) and hence, yielding a large number of DMUs as efficient ones. The main purpose of this paper is to overcome this inability. One of the methods for ranking efficient DMUs is minimizing the Coefficient of Variation (CV) for inputs-outputs weights. In this pape...

Cross Language Text Categorization Using a Bilingual Lexicon

With the popularity of the Internet growing at a phenomenal rate, an ever-increasing number of documents in languages other than English are available on the Internet. Cross language text categorization has attracted more and more attention for the organization of these heterogeneous document collections. In this paper, we focus on how to conduct effective cross language text categorization. To this en...

Learned Lexicon-Driven Interactive Video Retrieval

We combine in this paper automatic learning of a large lexicon of semantic concepts with traditional video retrieval methods into a novel approach to narrow the semantic gap. The core of the proposed solution is formed by the automatic detection of an unprecedented lexicon of 101 concepts. From there, we explore the combination of query-by-concept, query-by-example, query-by-keyword, and user in...

Publication date: 1996
